Synthetical Enlargement of MFCC Based Training Sets for Emotion Recognition
Abstract
Emotional state recognition through speech is currently a research topic of great interest. Using subliminal information in speech, it is possible to recognize a person's emotional state. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This makes the learning process more difficult, because of the generalization problems that arise under these conditions. In this work we propose a solution to this problem that consists of enlarging the training set by creating new virtual patterns. In emotional speech, most of the emotional information is carried by speed and pitch variations. A change in the average pitch that modifies neither the speed nor the pitch variations therefore does not affect the expressed emotion. We use this prior information to create new patterns by applying a pitch shift modification in the feature extraction process of the classification system. For this purpose, we propose a frequency scaling modification of the Mel Frequency Cepstral Coefficients (MFCC) used to classify the emotion. The proposed process allows us to synthetically increase the number of available patterns in the training set, thus increasing the generalization capability of the system and reducing the test error.
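The idea of frequency-scaled MFCCs can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the scaling is applied, so the sketch assumes it can be approximated by multiplying the edge frequencies of a triangular mel filterbank by a factor `alpha`. The function names, the 26-filter and 13-coefficient defaults, and the `alphas` values are all illustrative assumptions.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (assumed here, the abstract does not name one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate, alpha=1.0):
    """Triangular mel filters whose edge frequencies are multiplied by alpha.

    alpha = 1.0 gives the usual filterbank. Other values stretch or compress
    the filters along the frequency axis (hypothetical scaling scheme).
    """
    n_bins = n_fft // 2 + 1
    f_max = sample_rate / 2.0
    mel_max = hz_to_mel(f_max)
    # n_filters + 2 points equally spaced in mel, mapped to Hz, scaled, clipped.
    edges = [min(alpha * mel_to_hz(mel_max * i / (n_filters + 1)), f_max)
             for i in range(n_filters + 2)]
    bin_hz = sample_rate / n_fft
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            f = k * bin_hz
            if lo < f <= mid:
                w = (f - lo) / (mid - lo)      # rising slope
            elif mid < f < hi:
                w = (hi - f) / (hi - mid)      # falling slope
            else:
                w = 0.0
            row.append(w)
        bank.append(row)
    return bank

def mfcc_from_power_spectrum(power, bank, n_coeffs=13):
    """Log filterbank energies followed by an unnormalized DCT-II."""
    energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
    log_e = [math.log(e + 1e-10) for e in energies]  # small floor avoids log(0)
    n = len(log_e)
    return [sum(log_e[j] * math.cos(math.pi * c * (j + 0.5) / n)
                for j in range(n))
            for c in range(n_coeffs)]

def augment(power, n_fft, sample_rate, alphas=(0.9, 1.0, 1.1), n_filters=26):
    """Return one MFCC vector per scaling factor for a single frame."""
    return [mfcc_from_power_spectrum(
                power, mel_filterbank(n_filters, n_fft, sample_rate, a))
            for a in alphas]
```

Calling `augment(power, 512, 16000)` on the power spectrum of one frame (257 bins for a 512-point FFT) yields three feature vectors with the same emotion label: the original (`alpha = 1.0`) and two virtual patterns. Under this scheme, scaling the filter edges by `alpha` acts roughly like rescaling the frequency axis of the spectrum by `1/alpha`, which is why it can stand in for a shift of the average pitch without resynthesizing the audio.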
Similar resources
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in the area of emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Speech Emotion Recognition Using Residual Phase and MFCC Features
The main objective of this research is to develop a speech emotion recognition system using residual phase and MFCC features with an autoassociative neural network (AANN). The speech emotion recognition system classifies the speech emotion into predefined categories such as anger, fear, happy, neutral or sad. The proposed technique for speech emotion recognition (SER) has two phases: Fe...
Emotion Recognition using Dynamic Time Warping Technique for Isolated Words
Emotion recognition helps to recognize the internal expressions of individuals from a speech database. In this paper, the dynamic time warping (DTW) technique is used for speaker-independent emotion recognition based on 39 MFCC features. A large audio set of around 960 samples of isolated words in five different emotions was collected and recorded at 20 to 300 kHz sampling frequency...
MFCC based Enlargement of the Training Set for Emotion Recognition in Speech
Emotional state recognition through speech is currently a research topic of great interest. Using subliminal information in speech, known as “prosody”, it is possible to recognize a person's emotional state. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This fact makes the learning process more difficu...
Inferring the Human Emotional State of Mind using Assymetric Distrubution
This paper presents a methodology for emotion recognition based on a Skew Symmetric Gaussian Mixture Model classifier, with MFCC-SDC cepstral coefficients as the features, for recognizing various emotions from a generated data set of emotional voices of students of both genders at GITAM University. For training and testing of the developed methodology, the data collection...